Estimation of Latent Ability and Item Parameters
When There Are Omitted Responses 

Abstract 

Omitted items cannot properly be treated as wrong when estimating
ability and item parameters. A convenient method for utilizing the 
information provided by omissions is presented. Some theoretical and 
considerable empirical justification is adduced for the estimates obtained 
by both old and new methods. 






Estimation of Latent Ability and Item Parameters 
When There Are Omitted Responses^1

At the time the likelihood equations for item characteristic curve
(icc) theory were written down [Lord, 1953], there seemed to be three major
obstacles to practical applications:

1. Solution of the likelihood equations for data of real interest
did not seem practicable from a computational point of view
[Torgerson, 1958, p. 388].
2. Icc theory dealt with unspeeded tests, whereas almost all
standard tests are administered with a time limit that
prevents some examinees from finishing.
3. Icc theory was first developed for dichotomous items.
Typical test answer sheets, however, carry at least three
distinct types of examinee response: correct response,
wrong response, no response ("omits"). These three types
of response often receive different scoring weights (for
example, 1, a negative fraction, and 0, respectively), but even if
omitted responses are scored as wrong they cannot reasonably
be treated as wrong in the likelihood equations.
Solution of equations. Numerical solutions to the likelihood equations
have now successfully been obtained for large data sets [Lord, 1968;
Bock & Lieberman, 1970; Bock, 1972]. Even the parameter representing the
icc lower asymptote, sometimes incorrectly called the chance level, can now
be estimated by maximum likelihood [Wingersky & Lord, 1973] whenever there
are enough data to determine this part of the curve.

^1 Research reported in this paper has been supported by grant GB-5278IX
from National Science Foundation.

Speededness. Time-limit tests are, for some examinees, partly a
measure of something called speed, which is quite distinct from the ability
measured by the power score that would be obtained if the test were
administered without time limit. Icc theory is not presently equipped to deal
with the speed dimension explicitly, but the theory can still be used to
analyze answer sheets obtained in timed test administrations. To do this
requires the assumption that examinees answer test questions in order. For
each examinee, the items following his last recorded response (this item is
called the last item attempted) are ignored throughout the estimation process.
(In practice, examinees answering less than a third of the n items
are omitted from all analyses.) Thus examinee ability θ is estimated
from his responses to items actually attempted, and item parameters are
estimated from the responses of examinees who attempt the item.

This does not complicate the likelihood equations or the process of
solving them. A key property of icc theory is that item parameters do not
depend on the group of examinees tested, within reasonable limits; and that
examinee ability (θ) does not depend on the items administered, assuming
that all items measure the same psychological dimension. Thus, ignoring
various examinees and various items ought not to have serious effects.

It is true that a θ estimated in this way will approximate the
examinee's ability under power conditions only if his responses to the
items actually attempted would have been the same in the absence of a
time limit. Regardless of this, a θ estimated in this way appropriately
reflects the examinee's effective ability level under the timed conditions
actually provided.

If an examinee does not reach an item because of the time limit,
this fact contains no usable information for inferring his ability level
θ under the icc model. The one-dimensional icc model considered here
provides no way to make use of any relationship (ordinarily curvilinear)
that may exist between speed of response and θ. The term omitted
response or simply omit will hereafter refer only to items actually reached,
not to items after the "last item attempted."

Omitted items. The present paper proposes and discusses a method
for the effective use of the information represented by omitted responses
when estimating ability and item parameters. Another method has been
proposed and used by Bock [1972]. With the implementation of adequate
methods for dealing with omitted responses, the application of icc theory
to typical testing data is now effective and practical.

Omitted Responses

The meaning of an omitted response varies depending on the type of
item and how the test is scored. It will be assumed throughout this paper
that the items are multiple choice. If the score is the number of right
answers, to be denoted by x, then the examinee who omits any item is
acting against his own best interests. We will not consider such cases
here.

The only common alternative to the number-right score is the formula
score containing a penalty for wrong answers. If A is the number of
alternative responses provided for the test item, the usual formula score
is

(1)    $y_a = x_a - \dfrac{w_a}{A - 1}$ ,

where w_a is the number of wrong responses given by examinee a.

We will assume hereafter that (1) is used; also that examinees wish
to maximize their expected scores and that they are fully informed about
their best strategy for doing this. Under these conditions, an examinee
should omit an item only if he believes his chance of success on the item
is no greater than c ≡ 1/A. On the other hand, since the item has A
alternative responses, his chance of success cannot be less than c,
since he can always do this well by strictly random guessing. Following
this reasoning, we will assume hereafter that if an examinee were required
to respond to a long series of A-choice items that he had omitted, his
proportion of correct answers would be c. [Slakter (1968) presents
empirical evidence that examinees omit more items than they should according
to this assumption; however, his examinees were not explicitly instructed
as to their best strategy.]
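The omit-or-answer argument above can be checked with a little arithmetic. The following is a minimal sketch, assuming the formula score is rights minus wrongs divided by A - 1 with omits scored 0; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

def formula_score(n_right, n_wrong, n_choices):
    """Formula score: rights minus wrongs/(A - 1); omits contribute 0."""
    return Fraction(n_right) - Fraction(n_wrong, n_choices - 1)

def expected_gain_from_answering(p_correct, n_choices):
    """Expected change in formula score from answering instead of omitting,
    for an examinee whose chance of success on the item is p_correct."""
    p = Fraction(p_correct)
    return p - (1 - p) / (n_choices - 1)

A = 5  # hypothetical number of alternatives per item
print(formula_score(40, 10, A))                          # 75/2
print(expected_gain_from_answering(Fraction(1, A), A))   # 0
print(expected_gain_from_answering(Fraction(1, 2), A))   # 3/8
```

At a chance of success of exactly c = 1/A the expected gain is zero, so omitting and guessing are equally good; any chance of success above c makes answering the better strategy.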

The ICC Model 

In item characteristic curve (icc) theory, the probability that
examinee a will answer item i correctly may be denoted by P_ia ≡ P_i(θ_a),
here assumed to be an increasing function of his ability θ_a. It might
seem from the preceding paragraph that P_i(θ_a) = c whenever examinee a
is required to answer an item i that he previously had omitted. This
cannot be correct, however. If examinee b omits the same item, we would
have P_i(θ_b) = c, from which it would follow that θ_a = θ_b. Since two
examinees who omit the same item may be at very different ability levels,
it is clear that the probability c is a different kind of probability
than P_ia.

It might seem natural to think of P_ia as the relative frequency of
correct answers when item i is repeatedly administered to examinee a
under some hypothetical conditions requiring him to forget his previous
responses. This interpretation of P_ia is considered in detail by
Meredith [1965]. We cannot use this interpretation here (nor in most
practical work). Examinee a might know the answer to item i = 1 and
so have a probability of 1 of answering it correctly. At the same time,
he might be misinformed about item i = 2 and so have a probability of 0
of answering it correctly. At the same time, examinee b might have
probability 0 of answering item 1 correctly and probability 1 of answering
item 2 correctly. If the items measure the same trait, the four equations
P_1(θ_a) = P_2(θ_b) = 1, P_2(θ_a) = P_1(θ_b) = 0 are very difficult to
reconcile.

P_ia is most simply interpreted as the probability that examinee a
will give the right answer to a randomly chosen item having the same icc
as item i. An alternative interpretation is that P_i(θ_a) is the
probability that item i will be answered correctly by a randomly chosen
examinee of ability level θ = θ_a. These two interpretations will usually be
assumed to hold simultaneously.

These interpretations tell us nothing about the probability that a
specified examinee will answer a specified item correctly. It is this
last probability that would equal c if an examinee were required to
answer an item he has omitted.



The Likelihood Function 

Let the response of examinee a to item i be denoted by u_ia. For
a correct response, let u_ia = 1; for an incorrect response, let u_ia = 0.
When examinee a answers a test composed of n items, the u_ia are assumed
independent (assumption of local independence). If he does not omit any
items, the likelihood function for his responses can be written

(2)    $L_a(u_{1a}, u_{2a}, \ldots, u_{na} \mid \theta_a) = \prod_{i=1}^{n} P_{ia}^{u_{ia}} Q_{ia}^{1-u_{ia}}$ ,

where Q_ia ≡ 1 - P_ia. If the item parameters in P_ia that characterize
item i are known (approximately, from pretesting), then the maximum
likelihood estimate θ̂_a of the examinee's ability can be obtained from this
likelihood function by standard procedures.
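To make the estimation of θ from (2) concrete, here is a minimal sketch. The paper does not give the functional form of the icc at this point, so a three-parameter logistic curve with lower asymptote c_i is an assumption here, as are the item parameter values and the crude grid search standing in for the "standard procedures".

```python
import math

def p_correct(theta, a, b, c_i):
    """Assumed icc: three-parameter logistic with lower asymptote c_i."""
    return c_i + (1.0 - c_i) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, u):
    """Log of (2) for one examinee with no omitted items."""
    total = 0.0
    for (a, b, c_i), u_i in zip(items, u):
        p = p_correct(theta, a, b, c_i)
        total += u_i * math.log(p) + (1 - u_i) * math.log(1.0 - p)
    return total

def mle_theta(items, u, lo=-4.0, hi=4.0, steps=801):
    """Grid search for the maximizing theta on a bounded range."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, u))

# Hypothetical item parameters (a_i, b_i, c_i), assumed known from pretesting.
items = [(1.0, b, 0.2) for b in (-2.0, -1.0, 0.0, 1.0, 2.0)]
theta_low = mle_theta(items, [1, 0, 0, 0, 0])
theta_high = mle_theta(items, [1, 1, 1, 1, 0])
print(theta_low, theta_high)
```

A grid search is used only because it is easy to verify; a Newton-type iteration on the likelihood equation would be the more usual choice.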

If the item parameters are unknown, they can be estimated at the same
time as the θ_a from the responses of many (preferably two or three thousand)
examinees. In this case, the likelihood function is

(3)    $L(U \mid \theta) = \prod_{a=1}^{N} \prod_{i=1}^{n} P_{ia}^{u_{ia}} Q_{ia}^{1-u_{ia}}$ ,

where U is the matrix ||u_ia|| and θ is the vector {θ_1, θ_2, ..., θ_N}.
The maximum likelihood estimates of θ and of the item parameters can
actually be obtained in practice from this likelihood function by standard
procedures (somewhat surprisingly, in view of the very large number of
parameters to be estimated). The practical effectiveness of this procedure
is being demonstrated in work with real data (for example, Lord, 1970)
despite the (presumably temporary) lack of a rigorous proof that the
maximum likelihood estimates are consistent.

If the examinee omits certain items, it might seem that one could
simply omit these items altogether from (2) or (3) and proceed as before.
This cannot be right, however, since the fact that the examinee omitted
certain items carries the important information that he did not know the
answers to these items, that his chance of success was roughly only c on
each. We cannot afford to ignore this information. If we did, an examinee
could obtain as high a θ̂ as he wished, simply by omitting questions
whenever he was not completely sure of the correct answer.

One way to deal with this situation would be for the psychometrician to
replace each omitted response by a response drawn at random with probability
of success c. After all, this is just what some examinees do in actual
practice, instead of omitting items. According to the model, the likelihood
function (2) or (3) will hold for data so obtained.
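The random fill-in just described is easy to state in code. This is a minimal sketch; the `OMIT` marker and the function name are illustrative choices, not taken from the paper.

```python
import random

OMIT = None  # hypothetical marker for an omitted response

def fill_in_omits(responses, c, rng):
    """Replace each omit by a response that is right (1) with probability c
    and wrong (0) otherwise; recorded rights and wrongs are left untouched."""
    return [(1 if rng.random() < c else 0) if r is OMIT else r
            for r in responses]

rng = random.Random(0)
answer_sheet = [1, OMIT, 0, OMIT, 1]
print(fill_in_omits(answer_sheet, c=0.2, rng=rng))
```

Each call with a different random state yields a different "degraded" data set, which is the source of the extra error discussed next.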

Although this procedure should yield consistent estimates (as n → ∞),
it is objectionable from two related points of view. From the examinee's
point of view, it is unfair to saddle him with a possibly unfortunate set
of random responses. From the statistician's point of view, the procedure
degrades the data by introducing random error; it can only increase error
variance, it cannot possibly be truly beneficial.

It would be desirable to replace (2) or (3) by a likelihood function
that includes provision for omitted responses. Such a function, however,
would depend in part on the true probability that examinee a will omit
a randomly chosen item having the same item parameters as item i
(i = 1, 2, ..., n). This true probability would be a function, similar to
P_i(θ_a) but not the same, depending not only on θ_a and on certain
characteristics of the item, but also on a new trait of the examinee
representing his willingness to omit items. Even after simplifying
assumptions, there would be at least one new examinee parameter and one
new item parameter to estimate, considerably complicating the already
complicated and expensive estimation procedure.

New Estimation Procedure

The following estimation procedure has been used on several large sets
of data, apparently with great success, as briefly indicated in the next
section. The likelihood function (2) is replaced by

(4)    $L_a^*(v_{1a}, v_{2a}, \ldots, v_{na} \mid \theta_a) = \prod_{i=1}^{n} P_{ia}^{v_{ia}} Q_{ia}^{1-v_{ia}}$ ,

where v_ia = 1, 0, or c according to whether the response is right, wrong,
or omitted, respectively. Since for the present the item parameters must be
estimated at the same time as the θ_a, the responses of many (preferably two
or three thousand) examinees are analyzed simultaneously, so that (4) is
replaced by

(5)    $L^*(V \mid \theta) = \prod_{a=1}^{N} \prod_{i=1}^{n} P_{ia}^{v_{ia}} Q_{ia}^{1-v_{ia}}$ ,

where V is the matrix ||v_ia||. In the new procedure as actually used,
we find the values of θ_a for the N examinees and the values of three
item parameters for each of the n items that maximize (5). These values are
taken as the parameter estimates desired.

Since (4) and (5) are not likelihood functions, these estimates are
not maximum likelihood estimates. The estimate of θ from (4) or (5)
will be denoted by θ*. It is shown in the Appendix that in the case
of (4), the θ* converge for large n to the same values as do the θ̂
obtained from (2) after omits have been replaced by random responses.
Moreover, if there are omits, the sampling error of θ* for large n is
less than the sampling error of the maximum likelihood estimate obtained
from the degraded data.
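A small sketch may help show what maximizing (4) does for a single examinee with known item parameters. The three-parameter logistic icc, the item values, the `OMIT` marker, and the grid search are all assumptions made for illustration; the sketch also contrasts (4) with simply dropping the omitted items.

```python
import math

OMIT = None  # hypothetical marker for an omitted response

def p_correct(theta, a, b, c_i):
    """Assumed icc: three-parameter logistic with lower asymptote c_i."""
    return c_i + (1.0 - c_i) / (1.0 + math.exp(-a * (theta - b)))

def log_l_star(theta, items, responses, c):
    """Log of (4): v = 1 for right, 0 for wrong, c for an omitted item."""
    total = 0.0
    for (a, b, c_i), r in zip(items, responses):
        v = c if r is OMIT else r
        p = p_correct(theta, a, b, c_i)
        total += v * math.log(p) + (1.0 - v) * math.log(1.0 - p)
    return total

def log_l_ignoring_omits(theta, items, responses):
    """Log of (2) with the omitted items simply left out."""
    total = 0.0
    for (a, b, c_i), r in zip(items, responses):
        if r is not OMIT:
            p = p_correct(theta, a, b, c_i)
            total += r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return total

def argmax_on_grid(f, lo=-4.0, hi=4.0, steps=801):
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=f)

items = [(1.0, b, 0.2) for b in (-2.0, -1.0, 0.0, 1.0, 2.0)]
responses = [1, 1, OMIT, OMIT, OMIT]  # answers only the two easiest items
theta_star = argmax_on_grid(lambda t: log_l_star(t, items, responses, c=0.2))
theta_ignore = argmax_on_grid(lambda t: log_l_ignoring_omits(t, items, responses))
print(theta_star, theta_ignore)
```

With the omits dropped, the two right answers push the estimate to the upper end of the search range, which is the inflation the text warns about; under (4) the omitted items pull the estimate back to a finite interior value.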

Discussion of Assumptions 

Many people prefer number-right scoring to the formula scoring
considered here. Some common misinterpretations will be avoided by pointing
out that certain assumptions are not made here.

1. There is no assumption here that formula scores are superior
to number-right scores. If number-right scores were used, the
problem considered here should not arise, since in that case
examinees should not omit any items at all.

2. There is no assumption here that examinees guess at random
when they do not know the answer to an item. On the contrary,
many of the item characteristic curves found so far
in the analysis of nationally used tests show that low-ability
examinees tend to do less well on difficult items than
they would have done if they had responded at random. This
situation presumably arises because certain of the possible
item responses have been cleverly made so attractive that
low-level examinees tend to choose them in preference to the
correct answer.

3. The model given here is consistent with the obvious fact
that examinees use misinformation and partial information
in answering items. For a few items in nationally administered
tests, the icc never go below P_i(θ) = .50 or
P_i(θ) = .40, regardless of θ level. This suggests that
on some items even very low-level examinees may be able to
rule out two or three of the possible item responses as
incorrect.

It is assumed that the probability of a correct answer would be
c = 1/A if an examinee were required to respond to the A-choice items
he has omitted. As explained previously, this assumption is made because
an examinee who wants to maximize his expected score should never omit an
item if he can do better than choose among the A responses at random.






Empirical Results 

Several questions can be raised about the various parameter estimates
that have been discussed. In the first place, many people distrust the
assumptions of the icc model, particularly the assumption that there is
only one dimension θ underlying the test. The best way to resolve this
question is not to try to prove that the assumptions hold for a particular
set of data (they will never hold exactly), but to show that the parameter
estimates obtained provide a useful and effective summary of the data,
capable of predicting new sets of data not yet observed. The main purpose
of this section is to show just that, insofar as possible with the limited
investigations made to date.

Maximum likelihood estimates obtained from (3) are open to a further
objection: there exists no rigorous proof that these estimates are
consistent. A related but distinct problem is that it may seem hard to
believe that several thousand parameters can really be successfully
estimated simultaneously. It would be valuable to have a mathematical proof
of the asymptotic properties of the estimates. Any final answer to both
questions, however, must come by demonstrating the usefulness of estimates
obtainable in practice from samples of reasonable size.

Estimates obtained from (4) and (5) are open to further objections.
Equations (4) and (5) are not likelihood functions. No clear statistical
justification has been given for choosing these functions to be maximized
(there seems to be an interesting and unanswered question in statistical
inference here). Some justification of the estimates obtained are

2. Relation of Estimated Ability to Test Score

Figure 1 shows for test m the relation of the ability estimates θ*_a
obtained from (5) to formula score on the total test. Under the
probabilistic icc model, scatter of scores about the test characteristic curve
(the regression of test score on ability) is to be expected, because of
sampling fluctuations provided for in the model. The correlation ratio
of formula score on θ* was .978. For test v, a correlation ratio of
.982 was computed, but coarse grouping of formula scores makes this
somewhat too low.

3. Predicting Test Score from Estimated Ability

For given θ_a, the expected number-right score of examinee a is

(6)    $\mathcal{E}\, x_a = \sum_{i}^{(a)} P_i(\theta_a)$ ,

where $\sum^{(a)}$ represents summation over all items actually answered. The
correlation between $\sum^{(a)} P_i^*(\theta_a^*)$ and number-right score
corrected for omits was obtained for tests V and m. Here P*_i(θ*_a)
represents P_i(θ_a) with parameter estimates from (5) substituted for the
unknown values. The number-right score corrected for omits is
x_a + o_a/A, where o_a is the number of omitted items. For 60-item m, the
correlation was .981; for 90-item V, .992.

These high correlations show that the estimated parameters summarize
the data on the examinees' answer sheets very effectively.
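The two quantities being correlated here can be sketched as follows. The three-parameter logistic icc and the item values are assumptions for illustration, and the correction adds o_a/A to the number of rights, as read from the text.

```python
import math

def p_correct(theta, a, b, c_i):
    """Assumed icc: three-parameter logistic with lower asymptote c_i."""
    return c_i + (1.0 - c_i) / (1.0 + math.exp(-a * (theta - b)))

def expected_number_right(theta, items):
    """Equation (6): sum of P_i(theta) over the items entering the sum."""
    return sum(p_correct(theta, a, b, c_i) for (a, b, c_i) in items)

def number_right_corrected_for_omits(n_right, n_omitted, n_choices):
    """x + o/A: rights plus the rights expected from randomly filled omits."""
    return n_right + n_omitted / n_choices

# Hypothetical examinee: theta estimate 0.0 on two items of difficulty 0.0.
items = [(1.0, 0.0, 0.2), (1.0, 0.0, 0.2)]
print(expected_number_right(0.0, items))
print(number_right_corrected_for_omits(40, 10, 5))
```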




Again, some scatter of scores about their expected value, due to
sampling fluctuations, is provided for in the model. The scatter is
less than it would be if true parameters had been used instead of estimated
parameters. The reason is that chance irregularities in the data are
to some extent fitted in the course of the estimation process. Cross-validation
procedures, using a second random sample of data, could be
used to eliminate this.

4. Comparing Estimates of the Distribution of Ability

The histogram in Figure 2 shows for the 1807 examinees who answered
the last item in test m the frequency distribution of θ* obtained from
(5). The smooth curve shows an estimate h(θ) of the frequency
distribution of θ (not θ*) obtained by the method outlined below. The two
distributions are obtained in very different ways, under totally different sets
of assumptions, as detailed in Lord [1970], where icc were estimated by the
two different methods and compared.

Figure 2. Distribution of Estimated θ (histogram) and Estimated
Distribution of θ (curve).

In order to obtain h(θ), the frequency distribution g(ξ) of
true score ξ was estimated from the observed distribution of number-right
scores under the compound-binomial error model [Lord, 1969]. By
definition, true score is expected observed score, so by (6)

(7)    $\xi \equiv \sum_{i=1}^{n} P_i(\theta)$ .

When the item parameters are known, this equation defines ξ as a function
of θ, or θ as a function of ξ. Thus the distribution of ability
h(θ) was estimated for the group of 1807 examinees by first estimating
the distribution g(ξ) from the distribution of their number-right scores
and then transforming this distribution to that of θ by the functional
relationship (7) with estimated item parameters substituted for their
unknown true values.
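Because each P_i(θ) is increasing in θ, relationship (7) can be inverted numerically. The following is a minimal sketch using bisection; the three-parameter logistic icc, the item values, and the search interval are assumptions, and the paper does not say how the inversion was actually carried out.

```python
import math

def p_correct(theta, a, b, c_i):
    """Assumed icc: three-parameter logistic with lower asymptote c_i."""
    return c_i + (1.0 - c_i) / (1.0 + math.exp(-a * (theta - b)))

def true_score(theta, items):
    """Equation (7): true score as the sum of the P_i(theta)."""
    return sum(p_correct(theta, a, b, c_i) for (a, b, c_i) in items)

def theta_from_true_score(xi, items, lo=-8.0, hi=8.0, iters=60):
    """Invert (7) by bisection; valid because (7) is increasing in theta."""
    if not (true_score(lo, items) < xi < true_score(hi, items)):
        raise ValueError("true score outside the range reachable on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if true_score(mid, items) < xi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

items = [(1.2, -1.0, 0.2), (0.8, 0.0, 0.25), (1.0, 1.0, 0.2)]  # hypothetical
xi = true_score(0.3, items)
print(xi, theta_from_true_score(xi, items))
```

Note that true scores at or below the sum of the lower asymptotes correspond to no finite θ, which is why the sketch rejects values outside the reachable range.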

The discrepancies between the two distributions in Figure 2 occur where
they should. The θ* properly show a slightly more dispersed distribution
than the θ (represented by the smooth curve), since the θ* contain
errors of estimation. Because m is a difficult test, these errors are
quite large for low ability examinees, as discussed in an earlier section.
In view of the very different assumptions made, the excellent correspondence
shown in Figure 2 is strong evidence for the meaningfulness and practical
usefulness of the models and estimation procedures used to obtain these
results.

5. Estimated Frequency Distribution of Number-Right Scores

According to the icc model, the probability generating function
[Kendall & Stuart, 1958, section I.57] for the frequency distribution of
number-right scores is

(8)    $\sum_{x=0}^{n} \phi(x)\, t^x \equiv \int_{-\infty}^{\infty} \prod_{i=1}^{n} \left[ Q_i(\theta) + t P_i(\theta) \right] h(\theta)\, d\theta$ .

Using the estimated h(θ) in Figure 2 and using estimated item parameters
in P_i(θ) and Q_i(θ), the frequency distribution of number-right scores
was estimated from (8). The resulting φ(x) agreed to at least two decimal
places with that obtained under the compound-binomial error model. The
latter φ(x) agrees well with the actually observed distribution of scores,
the calculated chi square being near the 30-th percentile of the 
distribution for 5^ degrees of freedom. 
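The generating function (8) translates directly into a computation: for fixed θ, multiplying out the product of [Q_i + t P_i] item by item gives the conditional score distribution, which is then averaged over h(θ). The sketch below assumes a three-parameter logistic icc and replaces the integral over h(θ) by a weighted sum over a few hypothetical ability points; neither choice comes from the paper.

```python
import math

def p_correct(theta, a, b, c_i):
    """Assumed icc: three-parameter logistic with lower asymptote c_i."""
    return c_i + (1.0 - c_i) / (1.0 + math.exp(-a * (theta - b)))

def score_dist_given_theta(p_list):
    """Coefficients of prod_i [Q_i + t P_i]: P(x rights | theta), x = 0..n."""
    dist = [1.0]
    for p in p_list:
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p)   # item answered wrong
            new[x + 1] += prob * p       # item answered right
        dist = new
    return dist

def score_dist(items, theta_points, weights):
    """Discrete approximation to (8): mix the conditional distributions
    over ability points whose weights stand in for h(theta) d(theta)."""
    phi = [0.0] * (len(items) + 1)
    for theta, w in zip(theta_points, weights):
        cond = score_dist_given_theta([p_correct(theta, *it) for it in items])
        for x, prob in enumerate(cond):
            phi[x] += w * prob
    return phi

items = [(1.0, b, 0.2) for b in (-1.0, 0.0, 1.0)]  # hypothetical items
phi = score_dist(items, theta_points=[-1.0, 0.0, 1.0], weights=[0.25, 0.5, 0.25])
print(phi)
```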
6. Correlation of New Estimates and MLE

For test M, correlations were calculated between item parameter
estimates obtained by the two methods (eqs. 3 and 5). For the lower asymptote
(c_i) of the icc, the correlation was .990. For the discriminating power
(a_i) the correlation was .995. For the difficulty parameter (b_i) the
correlation was .9996. These results show that for purposes of estimating
item parameters, the new estimation method yields results virtually
equivalent to a maximum likelihood procedure based on the usual icc model and
filled-in observations.

Tests v, V, m, and M are all difficult for low ability examinees.
As a result, the ability parameter θ cannot be estimated accurately at
the lowest levels; one cannot effectively distinguish between θ_a = -5
and θ_a = -500. This makes no practical difference as long as the problem
is to predict the performance of the examinees on other tests that are about
equally difficult for them. When the examinees with estimated θ's
below 5*0 are omitted, the correlation between estimated θ's obtained
from (3) and (5) is .997 for the remaining 2867 examinees.

7. Likelihoods

For tests M and V, a new set of data was set up for cross-validation
by replacing the omits by a second set of random responses
chosen independently of the first. No parameters were estimated from
these cross-validation data. Instead, the estimated log L was evaluated
for these data, using estimated parameters in place of the unknown true
values in (3).



For test M, when the estimated parameters were MLE's obtained from
the original degraded data, the estimated log L of (3) for the
cross-validation data was -525^8, approximately; when the estimated parameters
were obtained from the original undegraded data by (5), the estimated log L
of (3) for the cross-validation data was -52295. Thus the new method (5)
seems to provide better estimates of the parameters of the conventional icc
model (higher likelihood for the cross-validation data) than does the
conventional method itself.

The test V results support the same conclusion. When the estimates
were MLE's obtained from (3), log L in (3) for the cross-validation data
was -58277; when the estimates were obtained by (5), log L in (3) for the
cross-validation data was

Appendix

We are given answer sheets with three kinds of responses: rights,
omits, and wrongs, which will be denoted by v_i = 1, c, and 0, respectively.
This appendix deals with the new estimation procedure for the case where
there is just one examinee, the item parameters having been already
determined.

Rewriting (4) with the subscript a dropped from most of the symbols,
we have

(A1)    $L^*(v_1, v_2, \ldots, v_n \mid \theta) = \prod_{i=1}^{n} P_i^{v_i} Q_i^{1-v_i}$ .

Our estimate of the ability of the examinee tested is θ*, the value of
θ that maximizes (A1).

If we replace all omits by randomly assigned responses, the likelihood
function under the icc model is

(A2)    $L(u_1, u_2, \ldots, u_n \mid \theta) = \prod_{i=1}^{n} P_i^{u_i} Q_i^{1-u_i}$ ,

where u_i = 1 or 0. This equation is the same as (2) except that a has
been dropped from most of the subscripts. The MLE θ̂ of the examinee's
ability obtained from (A2) is justified by the icc model. The empirical
results given in the last section suggest that θ* may be a superior
estimator to θ̂. It is most desirable, however, to have mathematical
proofs of some of the properties of θ*. No such proofs have been given
so far. The purpose of this Appendix is to indicate some relationships
between θ* and θ̂ that hold for large n. It will be shown that θ*
is a consistent estimator with a sampling variance smaller than that of
θ̂ (obtained after replacing omits by random responses).



In all that follows, we assume that θ is bounded, at least for any
group of examinees that we consider testing. Another limitation is that
we cannot estimate the ability of examinees who answer all items correctly
or all items incorrectly. We deal only with examinees who give at least
one right answer and at least one wrong answer. Since θ is bounded,
the probability that an examinee will be excluded by this limitation
approaches 0 for large n.

We will assume that P_i is an increasing function of θ, differentiable,
with 0 < c_i < P_i < 1, c_i being the lower asymptote. These
assumptions are easily satisfied by all icc ordinarily used for cognitive
tests. We will avoid using extremely difficult or extremely easy items,
so we can assume that P_i is bounded away from 0 and 1.

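Let us reorder the items so that the s omitted items are numbered
i = 1, 2, ..., s. Then equations (A1) and (A2) become respectively

(A3)    $L^* = \prod_{i=1}^{s} P_i^{c} Q_i^{1-c} \prod_{i=s+1}^{n} P_i^{u_i} Q_i^{1-u_i}$ ,

(A4)    $L = \prod_{i=1}^{s} P_i^{u_i} Q_i^{1-u_i} \prod_{i=s+1}^{n} P_i^{u_i} Q_i^{1-u_i}$ .

Taking logarithms and dividing by n we can write

(A5)    $\frac{1}{n}\log L^* = \frac{1}{n}\sum_{i=1}^{s} c \log\frac{P_i}{Q_i} + \frac{1}{n}\sum_{i=1}^{n} \log Q_i + \frac{1}{n}\sum_{i=s+1}^{n} u_i \log\frac{P_i}{Q_i}$ ,

(A6)    $\frac{1}{n}\log L^* = \frac{1}{n}\log L - \frac{1}{n}\sum_{i=1}^{s} (u_i - c) \log\frac{P_i}{Q_i}$ .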

Since the u_i (i = 1, 2, ..., s) are assigned at random with probability c
that u_i = 1, the quantity Z defined by

(A7)    $Z \equiv \frac{1}{s}\sum_{i=1}^{s} (u_i - c) \log\frac{P_i}{Q_i}$

is the average of s observations, each on a random variable having a mean
of zero. If s → ∞, the variance of (A7) always → 0 (since P_i is bounded
away from 0 and 1). Consequently the last term in (A6), which is (s/n)Z,
always converges in probability to zero.

Thus, for large n the likelihood function (A2) and the new function
(A1) tend to the same limit. This result makes the function (A1) a
plausible function to investigate, even though (A1) is itself not a
likelihood function.

Likelihood Equation

The log likelihood from (A2) is
$\sum_i u_i \log P_i + \sum_i (1 - u_i) \log Q_i$ .
Taking the derivative with respect to θ gives the result

(A8)    $\frac{d \log L}{d\theta} = \sum_{i=1}^{n} \left( \frac{u_i}{P_i} - \frac{1 - u_i}{Q_i} \right) P_i'$ ,

where P'_i ≡ dP_i/dθ. Setting this derivative equal to zero yields a
familiar likelihood equation

(A9)    $\frac{d \log L}{d\theta} = \sum_{i=1}^{n} (u_i - P_i) \frac{P_i'}{P_i Q_i} = 0$ .

Similarly, setting d log L*/dθ = 0, we obtain from (A1)

(A10)    $\frac{d \log L^*}{d\theta} = \sum_{i=1}^{n} (v_i - P_i) \frac{P_i'}{P_i Q_i} = 0$ .

When the items are ordered with omitted items first, (A10) can be written

(A11)    $\frac{d \log L^*}{d\theta} = \frac{d \log L}{d\theta} - \sum_{i=1}^{s} (u_i - c) \frac{P_i'}{P_i Q_i} = 0$ .

Consistency

Cramér's well-known proof [1946, section 33.3] that a likelihood
equation has a solution that converges in probability to the true value θ_0
as n → ∞ applies with minor modifications to θ̂ obtained from (A9)
(the cited proof only covers identically distributed variables). Cramér's
approach can also be used to prove that θ*, the solution of (A11),
converges in probability to the true value θ_0. When (A11) is divided by n
and then d log L/dθ expanded by a three-term Taylor's formula, we obtain
after replacing θ by θ*

(A12)    $\frac{1}{n}\frac{d \log L^*}{d\theta}\Big|_{\theta^*} = \frac{1}{n}\frac{d \log L}{d\theta}\Big|_{\theta_0} + (\theta^* - \theta_0)\,\frac{1}{n}\frac{d^2 \log L}{d\theta^2}\Big|_{\theta_0} + \frac{1}{2}(\theta^* - \theta_0)^2\,\frac{1}{n}\frac{d^3 \log L}{d\theta^3}\Big|_{\theta_1} - \frac{1}{n}\sum_{i=1}^{s} (u_i - c)\frac{P_i^{*\prime}}{P_i^* Q_i^*} = 0$ ,

where P*_i ≡ P_i(θ*), et cetera, and where θ_1 lies between θ* and θ_0.
The quantity $\frac{1}{n}\sum_{i=1}^{s} (u_i - c) P_i^{*\prime} / P_i^* Q_i^*$
can now be combined with the first (lowest order) term and neglected when
n is large. Cramér's argument then shows that θ* converges to θ_0.

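Sampling Variance and Efficiency

Let $\mathcal{E}$ denote expectation over the population of items, so that
$\mathcal{E} u_i = P_i^0$ and Var u_i = P_i^0 Q_i^0, where P_i^0 ≡ P_i(θ_0),
et cetera. The asymptotic sampling variance of the maximum likelihood
estimator is

(A13)    $\operatorname{Var} \hat\theta = \left[ \mathcal{E}\left( \frac{d \log L}{d\theta} \right)^2_{\theta_0} \right]^{-1} = \left[ \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{P_i^{0\prime} P_j^{0\prime}}{P_i^0 Q_i^0 P_j^0 Q_j^0}\, \mathcal{E}(u_i - P_i^0)(u_j - P_j^0) \right]^{-1} = \left[ \sum_{i=1}^{n} \frac{P_i^{0\prime 2}}{P_i^0 Q_i^0} \right]^{-1}$ .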



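Thus, from (A12), following Cramér,

(A14)    $\sqrt{n}\,(\theta^* - \theta_0) = \dfrac{ \frac{1}{\sqrt{n}}\frac{d \log L}{d\theta}\Big|_{\theta_0} - \frac{1}{\sqrt{n}}\sum_{i=1}^{s} (u_i - c)\frac{P_i^{*\prime}}{P_i^* Q_i^*} }{ -\frac{1}{n}\frac{d^2 \log L}{d\theta^2}\Big|_{\theta_0} - \frac{1}{2}(\theta^* - \theta_0)\,\frac{1}{n}\frac{d^3 \log L}{d\theta^3}\Big|_{\theta_1} }$ .

As n increases, θ* - θ_0 → 0 and the second term in the
denominator converges to zero. Thus, the entire denominator converges to

$-\frac{1}{n}\,\mathcal{E}\,\frac{d^2 \log L}{d\theta^2}\Big|_{\theta_0} = \frac{1}{n \operatorname{Var} \hat\theta} = k_0$ , say.

Since Var θ̂ is of order 1/n, k_0 is independent of n to our order
of approximation. If, as we suppose, s/n → some constant as n → ∞,
then the entire numerator is a random variable with zero mean and a
finite variance, which we must now proceed to determine.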



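The term $\frac{1}{\sqrt{n}}\frac{d \log L}{d\theta}\Big|_{\theta_0}$ is a
random variable with zero mean and variance

(A15)    $\frac{1}{n}\,\mathcal{E}\left( \frac{d \log L}{d\theta} \right)^2_{\theta_0} = \frac{1}{n \operatorname{Var} \hat\theta} = k_0$ .

The term $\frac{1}{\sqrt{n}}\sum_{i=1}^{s} (u_i - c)\frac{P_i^{*\prime}}{P_i^* Q_i^*}$
has zero mean. When s is fixed and large, its variance is approximately

(A16)    $\frac{1}{n}\, c(1 - c) \sum_{i=1}^{s} \frac{P_i^{*\prime 2}}{P_i^{*2} Q_i^{*2}} \approx \frac{1}{n}\, c(1 - c) \sum_{i=1}^{s} \frac{P_i^{0\prime 2}}{P_i^{02} Q_i^{02}}$ ,

since the u_i, i = 1, 2, ..., s are not used in computing θ*. For
fixed s, the covariance between the two numerator terms in (A14) is
found from (A9), using the same argument, to be

(A17)    $\frac{1}{n}\,\mathcal{E}\left[ \sum_{i=1}^{n} (u_i - P_i^0)\frac{P_i^{0\prime}}{P_i^0 Q_i^0} \cdot \sum_{j=1}^{s} (u_j - c)\frac{P_j^{*\prime}}{P_j^* Q_j^*} \right]$

$= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{s} \mathcal{E}\left[ \{(u_i - c) + (c - P_i^0)\}\frac{P_i^{0\prime}}{P_i^0 Q_i^0}\,(u_j - c)\frac{P_j^{*\prime}}{P_j^* Q_j^*} \right]$

$= \frac{1}{n}\, c(1 - c) \sum_{j=1}^{s} \frac{P_j^{0\prime}}{P_j^0 Q_j^0}\,\frac{P_j^{*\prime}}{P_j^* Q_j^*}$

$= \frac{1}{n}\, c(1 - c) \sum_{i=1}^{s} \frac{P_i^{0\prime 2}}{P_i^{02} Q_i^{02}}$ ,

approximately for large s and n.

We will need the general formulas for any random variables y, z, s:

(A18)    $\operatorname{Var}(y) = \mathcal{E}[\operatorname{Var}(y \mid s)] + \operatorname{Var}[\mathcal{E}(y \mid s)]$ ,

(A19)    $\operatorname{Cov}(y, z) = \mathcal{E}[\operatorname{Cov}(y, z \mid s)] + \operatorname{Cov}[\mathcal{E}(y \mid s), \mathcal{E}(z \mid s)]$ .

Conveniently, the second term on the right of each formula is zero for the
present application. Thus, by taking the expectation of (A16) with respect
to s, we obtain the unconditional asymptotic variance

(A20)    $\operatorname{Var}\left[ \frac{1}{\sqrt{n}}\sum_{i=1}^{s} (u_i - c)\frac{P_i^{*\prime}}{P_i^* Q_i^*} \right] = \frac{1}{n}\, c(1 - c)\, S$ ,

where

(A21)    $S \equiv \mathcal{E} \sum_{i=1}^{s} \frac{P_i^{0\prime 2}}{P_i^{02} Q_i^{02}} = \sum_{i=1}^{n} \frac{P_i^{0\prime 2}\,\omega_i}{P_i^{02} Q_i^{02}}$ ,

a positive quantity, where ω_i is the probability that the examinee will
omit an item with characteristic curve P_i. Similarly, the unconditional
covariance corresponding to (A17) is found to be the same quantity,
$\frac{1}{n}\, c(1 - c)\, S$.

From (A15) and (A20), we find the unconditional asymptotic variance
of the numerator in (A14) to be

$k_0 - \frac{1}{n}\, c(1 - c)\, S$ .

Finally, then, the asymptotic variance of √n(θ* - θ_0) is this divided by
k_0², or

(A22)    $\operatorname{Var} \theta^* = \operatorname{Var} \hat\theta\, \left[ 1 - c(1 - c)\, S \operatorname{Var} \hat\theta \right] = \dfrac{ 1 - c(1 - c)\, S \Big/ \sum_{i=1}^{n} \left( P_i^{0\prime 2} / P_i^0 Q_i^0 \right) }{ \sum_{i=1}^{n} P_i^{0\prime 2} / P_i^0 Q_i^0 }$ .

It is thus seen that the new method applied to the raw data (eqs. 4
and A1) has a smaller asymptotic sampling error than does the maximum
likelihood method applied (of necessity) to the filled-in data (eqs. 2
and A2). The relative efficiency of the MLE is given by the term in
brackets on the right of (A22).^2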
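The bracketed term in (A22) can be evaluated numerically once an icc and a set of omit probabilities are supplied. The sketch below assumes a three-parameter logistic icc (so that P' has a closed form) and uses hypothetical item parameters and omit probabilities ω_i; none of these values come from the paper.

```python
import math

def icc_and_slope(theta, a, b, c_i):
    """Assumed three-parameter logistic icc and its derivative in theta."""
    s = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    p = c_i + (1.0 - c_i) * s
    dp = a * (1.0 - c_i) * s * (1.0 - s)
    return p, dp

def relative_efficiency(theta0, items, omit_probs, c):
    """Return (Var theta-hat, Var theta-star, bracketed term) from (A22)."""
    info = 0.0  # sum of P'^2 / (P Q), the reciprocal of Var theta-hat (A13)
    S = 0.0     # sum of P'^2 * omega / (P^2 Q^2), equation (A21)
    for (a, b, c_i), omega in zip(items, omit_probs):
        p, dp = icc_and_slope(theta0, a, b, c_i)
        q = 1.0 - p
        info += dp * dp / (p * q)
        S += dp * dp * omega / (p * p * q * q)
    var_mle = 1.0 / info
    bracket = 1.0 - c * (1.0 - c) * S * var_mle
    return var_mle, var_mle * bracket, bracket

items = [(1.0, b, 0.2) for b in (-1.0, 0.0, 1.0, 2.0)]  # hypothetical items
omit_probs = [0.0, 0.1, 0.3, 0.5]                       # hypothetical omega_i
var_mle, var_star, bracket = relative_efficiency(0.0, items, omit_probs, c=0.2)
print(var_mle, var_star, bracket)
```

When every ω_i is zero the bracketed term is exactly 1 and the two variances coincide; any positive omit probability makes S positive and the variance of θ* strictly smaller than that of the MLE from the filled-in data.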



^2 The writer is indebted to Prof. Robert Jennrich for finding an
error in an earlier version of this conclusion.